Abstract

A generalization of Arikan's polar code construction using transformations of the form $G^{\otimes n}$,
where $G$ is an $\ell \times \ell$ matrix, is considered. Necessary and sufficient conditions are given for
these transformations to ensure channel polarization. It is shown that a large class of such 
transformations polarize symmetric binary-input memoryless channels. 

1 Introduction 

Polar codes, introduced by Arikan in [1], are the first provably capacity-achieving codes for arbitrary
symmetric binary-input discrete memoryless channels (B-DMCs) with low encoding and decoding
complexity. Polar code construction is based on the following observation: Let

\[ G_2 = \begin{bmatrix} 1 & 0 \\ 1 & 1 \end{bmatrix}. \]

Consider applying the transform $G_2^{\otimes n}$ (where "$\otimes n$" denotes the $n$th Kronecker power) to a block of
$N = 2^n$ bits and transmitting the output through independent copies of a B-DMC $W$ (see Figure
1). As $n$ grows large, the channels seen by individual bits (suitably defined in [1]) start polarizing:
they approach either a noiseless channel or a pure-noise channel, where the fraction of channels
becoming noiseless is close to the symmetric mutual information $I(W)$.
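
As a numerical illustration of this observation, the following Python sketch tracks polarization for one special case that is not treated in this note: a binary erasure channel with erasure probability eps. For that channel, one step of the 2 x 2 transform is known to yield two erasure channels with erasure probabilities 2*eps - eps^2 and eps^2, and the symmetric mutual information is 1 minus the erasure probability. The function names are hypothetical and the threshold delta is an arbitrary choice.

```python
def bec_erasure_probabilities(eps, n):
    """Erasure probabilities of the 2**n channels obtained from BEC(eps)
    after n recursive applications of the 2 x 2 transform."""
    z = [eps]
    for _ in range(n):
        nxt = []
        for e in z:
            nxt.append(2.0 * e - e * e)  # worse of the two synthesized channels
            nxt.append(e * e)            # better of the two synthesized channels
        z = nxt
    return z


def fraction_unpolarized(eps, n, delta=0.01):
    """Fraction of channels whose mutual information 1 - z lies in (delta, 1 - delta)."""
    z = bec_erasure_probabilities(eps, n)
    return sum(1 for e in z if delta < e < 1.0 - delta) / len(z)


def fraction_noiseless(eps, n, delta=0.01):
    """Fraction of channels that are nearly noiseless (erasure probability <= delta)."""
    z = bec_erasure_probabilities(eps, n)
    return sum(1 for e in z if e <= delta) / len(z)
```

Comparing fraction_unpolarized(0.5, 4) with fraction_unpolarized(0.5, 12) shows the unpolarized fraction shrinking as n grows, while the average erasure probability stays equal to eps at every level.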

It was conjectured in [1] that polarization is a general phenomenon, and is not restricted to the
particular transformation $G_2^{\otimes n}$. In this note we give a partial affirmation of this conjecture. In
particular, we consider transformations of the form $G^{\otimes n}$, where $G$ is an $\ell \times \ell$ matrix for $\ell \ge 3$, and
provide necessary and sufficient conditions for such $G$s to polarize symmetric B-DMCs.



2 Preliminaries 

Let $W : \{0,1\} \to \mathcal{Y}$ be a B-DMC. Let $I(W) \in [0,1]$ denote the mutual information between the
input and output of $W$ with uniform distribution on the inputs. Also let $Z(W) \in [0,1]$ denote the
Bhattacharyya parameter of $W$, i.e., $Z(W) = \sum_{y \in \mathcal{Y}} \sqrt{W(y \mid 0)\, W(y \mid 1)}$.
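
Both quantities are easy to evaluate for a channel with a finite output alphabet. The Python sketch below assumes a hypothetical representation in which a channel is a pair of lists, W[x][y] being the probability of output y given input x; the helper names bsc and bec are likewise illustrative and not taken from this note.

```python
import math


def symmetric_mutual_information(W):
    """I(W) in bits: mutual information under a uniform input distribution."""
    total = 0.0
    for x in (0, 1):
        for y, p in enumerate(W[x]):
            if p > 0.0:
                q = 0.5 * (W[0][y] + W[1][y])  # output probability under uniform input
                total += 0.5 * p * math.log2(p / q)
    return total


def bhattacharyya(W):
    """Z(W) = sum over y of sqrt(W(y|0) W(y|1))."""
    return sum(math.sqrt(a * b) for a, b in zip(W[0], W[1]))


def bsc(eps):
    """Binary symmetric channel with crossover probability eps."""
    return ([1.0 - eps, eps], [eps, 1.0 - eps])


def bec(eps):
    """Binary erasure channel; output index 1 is the erasure symbol."""
    return ([1.0 - eps, eps, 0.0], [0.0, eps, 1.0 - eps])
```

For the erasure channel these functions return I(W) = 1 - eps and Z(W) = eps, and for the binary symmetric channel Z(W) = 2 sqrt(eps (1 - eps)).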

Figure 1:

Fix an $\ell \ge 3$ and an invertible $\ell \times \ell$ $\{0,1\}$ matrix $G$. Consider a random $\ell$-vector $U_1^\ell$ that
is uniformly distributed over $\{0,1\}^\ell$. Let $X_1^\ell = U_1^\ell G$, where the multiplication is performed over
GF(2). Also let $Y_1^\ell$ be the output of $\ell$ uses of $W$ with the input $X_1^\ell$. Observe now that the channel
between $U_1^\ell$ and $Y_1^\ell$ is defined by the transition probabilities

\[ W_\ell(y_1^\ell \mid u_1^\ell) \triangleq \prod_{i=1}^{\ell} W(y_i \mid x_i) = \prod_{i=1}^{\ell} W\bigl(y_i \mid (u_1^\ell G)_i\bigr). \]

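Define $W^{(i)} : \{0,1\} \to \mathcal{Y}^\ell \times \{0,1\}^{i-1}$ as the channel with input $u_i$, output $(y_1^\ell, u_1^{i-1})$ and transition
probabilities

\[ W^{(i)}(y_1^\ell, u_1^{i-1} \mid u_i) = \frac{1}{2^{\ell-1}} \sum_{u_{i+1}^{\ell}} W_\ell(y_1^\ell \mid u_1^\ell), \]

and let $Z^{(i)}$ denote its Bhattacharyya parameter, i.e.,

\[ Z^{(i)} = \sum_{y_1^\ell,\, u_1^{i-1}} \sqrt{W^{(i)}(y_1^\ell, u_1^{i-1} \mid 0)\, W^{(i)}(y_1^\ell, u_1^{i-1} \mid 1)}. \]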

For $k \ge 1$, let $W^k : \{0,1\} \to \mathcal{Y}^k$ denote the B-DMC with transition probabilities

\[ W^k(y_1^k \mid x) = \prod_{j=1}^{k} W(y_j \mid x). \]

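Also let $\tilde{W}^{(i)} : \{0,1\} \to \mathcal{Y}^\ell$ denote the B-DMC with transition probabilities

\[ \tilde{W}^{(i)}(y_1^\ell \mid u_i) = \frac{1}{2^{\ell-i}} \sum_{u_{i+1}^{\ell}} W_\ell\bigl(y_1^\ell \mid 0_1^{i-1}, u_i^\ell\bigr). \]

Observation 1. If $W$ is symmetric, then the channels $W^{(i)}$ and $\tilde{W}^{(i)}$ are equivalent in the sense
that for any fixed $u_1^{i-1}$ there exists a permutation $\pi_{u_1^{i-1}} : \mathcal{Y}^\ell \to \mathcal{Y}^\ell$ such that

\[ W^{(i)}(y_1^\ell, u_1^{i-1} \mid u_i) = \frac{1}{2^{i-1}}\, \tilde{W}^{(i)}\bigl(\pi_{u_1^{i-1}}(y_1^\ell) \mid u_i\bigr). \]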
Finally, let $I^{(i)}$ denote the mutual information between the input and output of channel $W^{(i)}$.
Since $G$ is invertible, it is easy to check that

\[ \sum_{i=1}^{\ell} I^{(i)} = \ell\, I(W). \]

3 Polarization 

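We will say that $G$ is a polarizing matrix if there exists an $i \in \{1, \dots, \ell\}$ for which $\tilde{W}^{(i)}$ is equivalent
to $W^k$ for some $k \ge 2$, in the sense that

\[ \tilde{W}^{(i)}(y_1^\ell \mid u_i) = c \prod_{j \in A} W(y_j \mid u_i) \qquad (3) \]

for some constant $c$ and $A \subseteq \{1, \dots, \ell\}$ with $|A| = k$. If $W$ is symmetric, then Observation 1 implies
the equivalence of $W^{(i)}$ and $W^k$ (which we denote by $W^{(i)} \equiv W^k$) in the sense that

\[ W^{(i)}(y_1^\ell, u_1^{i-1} \mid u_i) = \frac{c}{2^{i-1}} \prod_{j \in A} W\bigl((\pi_{u_1^{i-1}}(y_1^\ell))_j \mid u_i\bigr). \qquad (4) \]

Note that the equivalence $W^{(i)} \equiv W^k$ implies $I^{(i)} = I(W^k)$ and $Z^{(i)} = Z(W^k)$.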

It will be shown that channel transformations of the form $G^{\otimes n}$ polarize symmetric channels if
and only if $G$ is polarizing. This statement is made precise in the following theorem:

Theorem 1. Fix a symmetric B-DMC $W$. Let $G^{\otimes n}$ denote the $n$th Kronecker power of $G$ and
consider the transformation $G^{\otimes n} : W \to (W^{(i)} : i = 1, \dots, \ell^n)$.

i. If $G$ is polarizing, then for any $\delta > 0$

\[ \lim_{n \to \infty} \frac{\#\{\, i \in \{1, \dots, \ell^n\} : I(W^{(i)}) \in (\delta, 1-\delta) \,\}}{\ell^n} = 0. \]

ii. If $G$ is not polarizing, then

\[ I(W^{(i)}) = I(W) \quad \text{for all } n \text{ and } i \in \{1, \dots, \ell^n\}. \]

Theorem 1 is a direct consequence of Lemmas 1 and 2 below.

Note that any invertible $\{0,1\}$ matrix $G$ can be written as a (real) sum $G = P + P'$, where $P$
is a permutation matrix and $P'$ is a $\{0,1\}$ matrix. This fact can be inferred from Hall's Theorem
[3, Theorem 16.4]. Therefore, for any such matrix $G$, there exists a column permutation that
results in $G_{ii} = 1$ for all $i$. Since the transition probabilities defining $W^{(i)}$ are invariant (up to a
permutation of the outputs $y_1^\ell$) under column permutations on $G$, we only consider matrices with
1s on the diagonal.

The following lemma gives necessary and sufficient conditions for (3) to be satisfied:

Lemma 1. For any symmetric B-DMC $W$,

i. If $G$ is not upper triangular, then there exists an $i$ for which $W^{(i)} \equiv W^k$ for some $k \ge 2$.

ii. If $G$ is upper triangular, then $W^{(i)} \equiv W$ for all $1 \le i \le \ell$.

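Proof. Let $G^{(\ell-i)}$ be the $(\ell-i) \times (\ell-i)$ matrix obtained from $G$ by removing its last $i$ rows and
columns. Let the number of 1s in the last row of $G$ be $k$. Clearly $W^{(\ell)} \equiv W^k$. If $k \ge 2$ then
$G$ is not upper triangular and the first claim of the lemma holds. If $k = 1$, then $W^{(\ell)} \equiv W$, and
$(x_1, \dots, x_{\ell-1})$ is independent of $u_\ell$. One can then write

\begin{align*}
W^{(\ell-i)}(y_1^\ell, u_1^{\ell-i-1} \mid u_{\ell-i})
&= \frac{1}{2^{\ell-1}} \sum_{u_{\ell-i+1}^{\ell}} W_\ell(y_1^\ell \mid u_1^\ell) \\
&= \frac{1}{2^{\ell-1}} \sum_{u_{\ell-i+1}^{\ell}} \Pr[Y_1^{\ell-1} = y_1^{\ell-1} \mid U_1^\ell = u_1^\ell]\, \Pr[Y_\ell = y_\ell \mid Y_1^{\ell-1} = y_1^{\ell-1}, U_1^\ell = u_1^\ell] \\
&\overset{(a)}{=} \frac{1}{2^{\ell-1}} \sum_{u_{\ell-i+1}^{\ell-1},\, u_\ell} W_{\ell-1}(y_1^{\ell-1} \mid u_1^{\ell-1})\, \Pr[Y_\ell = y_\ell \mid Y_1^{\ell-1} = y_1^{\ell-1}, U_1^\ell = u_1^\ell] \\
&= \frac{1}{2^{\ell-1}} \sum_{u_{\ell-i+1}^{\ell-1}} W_{\ell-1}(y_1^{\ell-1} \mid u_1^{\ell-1}) \sum_{u_\ell} \Pr[Y_\ell = y_\ell \mid Y_1^{\ell-1} = y_1^{\ell-1}, U_1^\ell = u_1^\ell] \\
&= \frac{1}{2^{\ell-1}} \bigl[ W(y_\ell \mid 0) + W(y_\ell \mid 1) \bigr] \sum_{u_{\ell-i+1}^{\ell-1}} W_{\ell-1}(y_1^{\ell-1} \mid u_1^{\ell-1}),
\end{align*}

where (a) follows from the fact that $G_{\ell k} = 0$ for all $k < \ell$. Therefore $y_\ell$ is independent of
the inputs to the channels $W^{(\ell-i)}$ for $i = 1, \dots, \ell-1$. This is equivalent to saying that channels
$W^{(1)}, \dots, W^{(\ell-1)}$ are defined by the matrix $G^{(\ell-1)}$. Applying the same argument to $G^{(\ell-1)}$ and
repeating, we see that if $G$ is upper triangular, then we have $W^{(i)} \equiv W$ for all $i$. On the other
hand, if $G$ is not upper triangular, then there exists an $i$ for which $G^{(\ell-i)}$ has at least two 1s
in the last row, which in turn implies $W^{(\ell-i)} \equiv W^k$ for some $k \ge 2$. □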

Remark 1. The above lemma says that all transformations that are not upper triangular are 
polarizing. Moreover, upper triangular transformations have no effect on the channel, i.e., each bit 
sees an independent copy of W after an upper triangular transformation. 
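
The criterion of Lemma 1 can be checked mechanically for small matrices. The Python sketch below is an illustration under stated assumptions rather than a procedure given in this note: it takes G as a list of 0/1 rows, verifies invertibility over GF(2) by Gaussian elimination, searches by brute force for a column permutation that puts 1s on the diagonal, and then tests whether any entry below the diagonal is 1. All function names are hypothetical, and the exhaustive search is only practical for small sizes.

```python
from itertools import permutations


def is_invertible_gf2(G):
    """Gauss-Jordan elimination over GF(2) on a copy of G."""
    A = [row[:] for row in G]
    ell = len(A)
    for col in range(ell):
        pivot = next((r for r in range(col, ell) if A[r][col] == 1), None)
        if pivot is None:
            return False
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(ell):
            if r != col and A[r][col] == 1:
                A[r] = [a ^ b for a, b in zip(A[r], A[col])]
    return True


def unit_diagonal_column_order(G):
    """Return a column order p with G[i][p[i]] == 1 for every i, or None."""
    ell = len(G)
    for p in permutations(range(ell)):
        if all(G[i][p[i]] == 1 for i in range(ell)):
            return p
    return None


def is_polarizing(G):
    """True if G, after moving 1s onto the diagonal, is not upper triangular."""
    if not is_invertible_gf2(G):
        raise ValueError("G must be invertible over GF(2)")
    p = unit_diagonal_column_order(G)  # exists for every invertible G
    H = [[row[j] for j in p] for row in G]
    return any(H[i][j] == 1 for i in range(len(H)) for j in range(i))
```

If some column order makes the matrix upper triangular with a unit diagonal, that order is the only one with a unit diagonal, so returning the first order found does not affect the answer.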

Corollary 1. For any polarizing transformation $G$, there exist an $i \in \{1, \dots, \ell\}$ and $k \ge 2$ for
which

\[ I^{(i)} = I(W^k), \qquad (5) \]
\[ Z^{(i)} = Z(W)^k. \qquad (6) \]

Proof. The first claim is trivial. The second claim follows from the fact that the Bhattacharyya
parameter of any channel of the form $\prod_j W_j$ is given by $\prod_j Z(W_j)$. □
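
For small matrices these statements can be checked by exhaustive enumeration. The following Python sketch assumes that W is a binary symmetric channel with crossover probability eps (a choice made only for illustration) and computes the mutual information and the Bhattacharyya parameter of each bit channel, taking its transition probabilities to be the sum of the $\ell$-use channel over the later bits divided by $2^{\ell-1}$. The function name and the list-of-rows matrix representation are hypothetical.

```python
import math
from itertools import product


def synthesized_channel_stats(G, eps):
    """Return [(I_i, Z_i)] for the bit channels built from BSC(eps) and G."""
    ell = len(G)

    def w(y, x):
        return 1.0 - eps if y == x else eps

    def w_ell(y, u):
        # x = u G over GF(2); the ell channel uses are independent
        prob = 1.0
        for j in range(ell):
            x_j = sum(u[i] * G[i][j] for i in range(ell)) % 2
            prob *= w(y[j], x_j)
        return prob

    stats = []
    for i in range(ell):  # 0-based position of the input bit
        info, bhat = 0.0, 0.0
        for y in product((0, 1), repeat=ell):
            for past in product((0, 1), repeat=i):
                p = []
                for u_i in (0, 1):
                    s = sum(w_ell(y, past + (u_i,) + future)
                            for future in product((0, 1), repeat=ell - i - 1))
                    p.append(s / 2 ** (ell - 1))
                bhat += math.sqrt(p[0] * p[1])
                q = 0.5 * (p[0] + p[1])  # output probability under uniform input
                for val in p:
                    if val > 0.0:
                        info += 0.5 * val * math.log2(val / q)
        stats.append((info, bhat))
    return stats
```

With G = [[1,0,0],[1,1,0],[0,1,1]], whose last row contains two 1s, the last entry has Bhattacharyya parameter Z(W)^2, and the mutual informations sum to 3 I(W). With an upper triangular G, every entry equals (I(W), Z(W)).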
4 Convergence

Consider recursively combining channels $W$ as in [1], using a polarizing transformation $G$. Following
Arikan, associate to this construction a tree process $\{W_n ; n \ge 0\}$ with

\[ W_0 = W, \qquad W_{n+1} = W_n^{(B_{n+1})}, \]

where $\{B_n ; n \ge 1\}$ is a sequence of i.i.d. random variables defined on a probability space $(\Omega, \mathcal{F}, \mu)$,
$B_n$ being uniformly distributed over the set $\{1, \dots, \ell\}$. Define $\mathcal{F}_0 = \{\emptyset, \Omega\}$ and $\mathcal{F}_n = \sigma(B_1, \dots, B_n)$
for $n \ge 1$. Define the processes $\{I_n ; n \ge 0\} = \{I(W_n) ; n \ge 0\}$ and $\{Z_n ; n \ge 0\} = \{Z(W_n) ; n \ge 0\}$.



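Observation 2. $\{(I_n, \mathcal{F}_n)\}$ is a bounded martingale and therefore converges a.s. and in $\mathcal{L}^1$ to a
random variable $I_\infty$.

Lemma 2. If $W$ is symmetric and $G$ is polarizing, then

\[ I_\infty = \begin{cases} 1 & \text{w.p. } I(W), \\ 0 & \text{w.p. } 1 - I(W). \end{cases} \]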

Proof. By the convergence in $\mathcal{L}^1$ of $I_n$ we have $E[|I_{n+1} - I_n|] \to 0$ as $n \to \infty$. Since $G$ is a polarizing matrix,
Lemma 1 implies

\[ I_{n+1} = I(W_n^k) \quad \text{with probability at least } \frac{1}{\ell}, \]

for some $k \ge 2$. This in turn implies

\[ E[|I_{n+1} - I_n|] \ge \frac{1}{\ell}\, E[I(W_n^k) - I(W_n)] \to 0. \qquad (7) \]

It is shown in the Appendix that for any symmetric B-DMC $W_n$, if $I(W_n) \in (\delta, 1-\delta)$ for some
$\delta > 0$, then there exists an $\eta(\delta) > 0$ such that $I(W_n^k) - I(W_n) > \eta(\delta)$. We therefore conclude that
convergence in (7) implies $I_\infty \in \{0, 1\}$ w.p. 1. The claim on the probability distribution of $I_\infty$
follows from the fact that $\{I_n\}$ is a martingale, i.e., $E[I_\infty] = E[I_0] = I(W)$. □

Corollary 2. If $W$ is symmetric and $G$ is polarizing, then $\{Z_n\}$ converges a.s. to a random variable
$Z_\infty$ and

\[ Z_\infty = \begin{cases} 0 & \text{w.p. } I(W), \\ 1 & \text{w.p. } 1 - I(W). \end{cases} \]

Proof. The proof follows from the fact that $I_n \to I_\infty$ a.s. and the inequalities [1]

\[ I(Q)^2 + Z(Q)^2 \le 1, \]
\[ I(Q) + Z(Q) \ge 1, \]

for any B-DMC $Q$. □
Theorem 2. Given a symmetric B-DMC $W$, an $\ell \times \ell$ polarizing matrix $G$, and any $\beta < 1/\ell$,

\[ \lim_{n \to \infty} \Pr\bigl[ Z_n \le 2^{-2^{n\beta}} \bigr] = I(W). \]

Proof Idea. For any polarizing matrix it can be shown that $Z_{n+1} \le \ell Z_n$ with probability 1 and
that $Z_{n+1} \le Z_n^2$ with probability at least $1/\ell$. The proof then follows by adapting the proof of [2,
Theorem 3]. □



5 Discussion 

Using Arikan's rule for choosing the information bits, polar codes of blocklength $N = \ell^n$ can be
constructed starting with any polarizing $\ell \times \ell$ matrix $G$. The encoding and successive cancellation
decoding complexities of such codes are $O(N \log N)$. Using similar arguments, it is easy to show
that polar codes of blocklength $N = \prod_{i=1}^{n} \ell_i$ can be constructed from generator matrices of the form
$\otimes_i G_i$, where each $G_i$ is a polarizing matrix of size $\ell_i \times \ell_i$. The encoding and successive cancellation
decoding complexities of these codes are also $O(N \log N)$.
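
The encoding cost can be made concrete with a short sketch. Assuming the codeword is $x = u\, G^{\otimes n}$ over GF(2), the block structure $G^{\otimes n} = G \otimes G^{\otimes (n-1)}$ allows the $\ell$ sub-blocks of $u$ to be transformed recursively and then combined through $G$, for on the order of $\ell N \log_\ell N$ bit operations. The Python below is an illustrative sketch only: the function names are hypothetical, the choice of information and frozen bits is omitted, and encode_direct is included solely as a slow reference for comparison.

```python
def kron(A, B):
    """Kronecker product of two 0/1 matrices given as lists of rows."""
    return [[a * b for a in row_a for b in row_b]
            for row_a in A for row_b in B]


def encode_direct(u, G, n):
    """Reference encoder: build the n-th Kronecker power of G explicitly."""
    M = [[1]]
    for _ in range(n):
        M = kron(G, M)
    N = len(M)
    return [sum(u[i] * M[i][j] for i in range(N)) % 2 for j in range(N)]


def encode_fast(u, G):
    """Recursive encoder; len(u) must be a power of len(G)."""
    ell = len(G)
    N = len(u)
    if N == 1:
        return list(u)
    block = N // ell
    # Transform each sub-block with the (n-1)-th Kronecker power of G.
    parts = [encode_fast(u[a * block:(a + 1) * block], G) for a in range(ell)]
    # Combine: output block b is the GF(2) sum of the parts selected by column b of G.
    x = []
    for b in range(ell):
        for t in range(block):
            x.append(sum(G[a][b] * parts[a][t] for a in range(ell)) % 2)
    return x
```

Each recursion level touches every position a constant number of times for fixed $\ell$, and there are $\log_\ell N$ levels, which is where the $N \log N$ count comes from.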



Appendix 

In this section we prove the following: 

Lemma 3. Let $W$ be a symmetric B-DMC and let $W^k$ be defined as above. If $I(W) \in (\delta, 1-\delta)$
for some $\delta > 0$, then there exists an $\eta(\delta) > 0$ such that $I(W^k) - I(W) > \eta(\delta)$.

We will use the following theorem in proving Lemma 3:

Theorem 3 ([HE]). Let $W_1, \dots, W_k$ be $k$ symmetric B-DMCs with capacities $I_1, \dots, I_k$, respectively.
Let $W^{(k)}$ denote the channel with transition probabilities

\[ W^{(k)}(y_1^k \mid x) = \prod_{i=1}^{k} W_i(y_i \mid x). \]

Also let $W^{(k)}_{\mathrm{BSC}}$ denote the channel with transition probabilities

\[ W^{(k)}_{\mathrm{BSC}}(y_1^k \mid x) = \prod_{i=1}^{k} W_{\mathrm{BSC}(\epsilon_i)}(y_i \mid x), \]

where $\mathrm{BSC}(\epsilon_i)$ denotes the binary symmetric channel with crossover probability $\epsilon_i \in [0, \tfrac{1}{2}]$, $\epsilon_i =
h^{-1}(1 - I_i)$, and $h$ denotes the binary entropy function. Then $I(W^{(k)}) \ge I(W^{(k)}_{\mathrm{BSC}})$.

Remark 2. Consider the transmission of a single bit $X$ using $k$ independent symmetric B-DMCs
$W_1, \dots, W_k$ with capacities $I_1, \dots, I_k$. Theorem 3 states that over the class of all symmetric channels
with given mutual informations, the mutual information between the input and the output vector is
minimized when each of the individual channels is a BSC.
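Proof of Lemma 3. Let $\epsilon \in [0, \tfrac{1}{2}]$ be the crossover probability of a BSC with capacity $I(W)$, i.e.,
$\epsilon = h^{-1}(1 - I(W))$, and write $\bar{\epsilon} = 1 - \epsilon$. Note that for $k \ge 2$,

\[ I(W^k) \ge I(W^2). \]

By Theorem 3 we have $I(W^2) \ge I(W^2_{\mathrm{BSC}(\epsilon)})$. A simple computation shows that

\[ I(W^2_{\mathrm{BSC}(\epsilon)}) = 1 + h(2\epsilon\bar{\epsilon}) - 2h(\epsilon). \]

We can then write

\begin{align*}
I(W^k) - I(W) &\ge I(W^2_{\mathrm{BSC}(\epsilon)}) - I(W) \\
&= I(W^2_{\mathrm{BSC}(\epsilon)}) - I(W_{\mathrm{BSC}(\epsilon)}) \\
&= h(2\epsilon\bar{\epsilon}) - h(\epsilon). \qquad (8)
\end{align*}

Note that $I(W) \in (\delta, 1-\delta)$ implies $\epsilon \in (\phi(\delta), \tfrac{1}{2} - \phi(\delta))$ where $\phi(\delta) > 0$, which in turn implies
$h(2\epsilon\bar{\epsilon}) - h(\epsilon) > \eta(\delta)$ for some $\eta(\delta) > 0$. □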
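
The "simple computation" and the positivity of the gap in (8) can be spot-checked numerically. The Python sketch below, with hypothetical function names, evaluates the mutual information between one uniform bit and its two independent BSC observations by direct enumeration, and compares it with the closed form $1 + h(2\epsilon(1-\epsilon)) - 2h(\epsilon)$.

```python
import math


def h(p):
    """Binary entropy function in bits."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def info_two_bsc_uses(eps):
    """I(X; Y1, Y2) for a uniform bit X sent over two independent BSC(eps)."""
    total = 0.0
    for y1 in (0, 1):
        for y2 in (0, 1):
            p = [(1.0 - eps if y1 == x else eps) * (1.0 - eps if y2 == x else eps)
                 for x in (0, 1)]
            q = 0.5 * (p[0] + p[1])  # probability of (y1, y2) under uniform input
            for val in p:
                if val > 0.0:
                    total += 0.5 * val * math.log2(val / q)
    return total


def gap(eps):
    """Right-hand side of (8): h(2 eps (1 - eps)) - h(eps)."""
    return h(2.0 * eps * (1.0 - eps)) - h(eps)
```

The gap is positive on the open interval (0, 1/2) because $2\epsilon(1-\epsilon)$ lies strictly between $\epsilon$ and 1/2 there and $h$ is increasing on [0, 1/2]; it vanishes at both endpoints, which is why the argument requires $\epsilon$ to stay away from 0 and 1/2.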